# General Visual Representation

Sam2 Hiera Small.fb R896 2pt1

SAM2 weights (HieraDet image encoder only) packaged for the timm library, derived from Facebook's Hiera Small model.

Image Segmentation

CLIP ViT B 32 CommonPool.M.text S128m B4k

A vision-language model based on the CLIP architecture, supporting zero-shot image classification tasks.

CLIP ViT B 32 CommonPool.M S128m B4k

A zero-shot image classification model based on the CLIP architecture, supporting general vision-language tasks.

CLIP ViT B 32 CommonPool.S.basic S13m B4k

A vision-language model based on the CLIP architecture, supporting zero-shot image classification tasks.
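The CLIP entries above all mention zero-shot image classification. As a rough illustration of the scoring step only, the sketch below compares one image embedding against a text embedding per candidate label and applies a softmax to the scaled cosine similarities. The embeddings here are hypothetical toy vectors, not outputs of any of the listed models, and the function name `zero_shot_classify` and the `scale` value of 100 are assumptions made for the example; in practice the embeddings would come from the model's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_emb, text_embs, scale=100.0):
    """Return {label: probability} via softmax over scaled cosine similarities.

    `scale` stands in for the model's logit scale (assumed value).
    """
    logits = {label: scale * cosine(image_emb, emb)
              for label, emb in text_embs.items()}
    # Subtract the max logit before exponentiating for numerical stability.
    top = max(logits.values())
    exps = {label: math.exp(value - top) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical toy embeddings; real ones would come from the CLIP encoders.
image_emb = [0.9, 0.1, 0.2]
text_embs = {
    "a photo of a cat": [1.0, 0.0, 0.1],
    "a photo of a dog": [0.0, 1.0, 0.1],
    "a photo of a car": [0.1, 0.1, 1.0],
}

probs = zero_shot_classify(image_emb, text_embs)
best = max(probs, key=probs.get)
print(best)
```

No labeled training images are involved: the candidate classes are supplied as text prompts at inference time, which is what makes the classification "zero-shot".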
